Draft

Add buffers to a shapefile in Python

Adding buffers around points is essential for species distribution models. This post shows how to add buffers with different radii around the red and green Kangaroo’s Paw using Python.

Eukaryota
Plantae
Charophyta
Commelinales
Haemodoraceae
Chordata
Anigozanthos
Buffers
Python
Authors

Amanda Buyan

Dax Kellie

Published

November 23, 2023

Author

Amanda Buyan
Dax Kellie

Date

23 November 2023

Buffers are used in many ecological spaces. Most notably, they are used in Species Distribution Models to extend the possible range of rare or threatened species to create a better model. They can also be used to capture occurrences or areas which may have been neglected by a user’s original data.

In this post, we show how to use {galah-python}, {geopandas}, {shapely} and {matplotlib} to add a buffer to occurrences of Kangaroo Paw, specifically the Red and Green Kangaroo Paw.

Choosing the right taxon

Kangaroo Paw is a common name for a number of species, represented by the genus Anigozanthos. They are perennial plants, native to the south-west of Western Australia. They are unique, bird-attracting flowers which open at the apex with six claw-like structures resembling kangaroo paws, hence their name!

To begin, we will look for the number of occurrences for the genus Anigozanthos. To do this, we will first start by loading galah-python.

import galah
galah.galah_config(email="your-email-here")

Now, we will use the atlas_counts() function to see a count of the number of occurrences of all species of Kangaroo’s Paw in the ALA.

galah.atlas_counts(taxa="Anigozanthos")
   totalRecords
0          5952

It is surprising to note that there are only 84 occurrences coming up for all species in this genus! To ensure that we have the correct name, we will use search_taxa() to see if there is anything going on with this genus.

galah.search_taxa(taxa="Anigozanthos")
  scientificName scientificNameAuthorship                                    taxonConceptID   rank  kingdom      phylum         order         family         genus vernacularName   issues
0   Anigozanthos                  Labill.  https://id.biodiversity.org.au/node/apni/2890004  genus  Plantae  Charophyta  Commelinales  Haemodoraceae  Anigozanthos   Kangaroo Paw  noIssue

If you scroll across to the rank column that is returned by this function, unfortunately, the search item that comes up for search_taxa() is unranked, which explains why we are getting very few records! What we can do now to refine our search is add a higher order of taxonomic classification to our query. By searching on the ALA species pages, we know that Anigozanthos is a genus belonging to the Haemodoraceae family. galah-python has an argument to search_taxa() called scientific_name, which allows the user to disambiguate their query.

Now, if we add the extra taxonomic information, we get thte following query:

galah.search_taxa(scientific_name={"family": ["Haemodoraceae"],"genus": ["Anigozanthos"],"scientificName": ["Anigozanthos"]})
  scientificName scientificNameAuthorship                                    taxonConceptID   rank  kingdom      phylum         order         family         genus vernacularName   issues
0   Anigozanthos                  Labill.  https://id.biodiversity.org.au/node/apni/2890004  genus  Plantae  Charophyta  Commelinales  Haemodoraceae  Anigozanthos   Kangaroo Paw  noIssue

Here, if we look at the information, we can see that our scientificName matches the genus we want, and the rank is now listed as genus. Now, if we get the total record count using this disambiguation:

galah.atlas_counts(scientific_name={"family": ["Haemodoraceae"],"genus": ["Anigozanthos"],"scientificName": ["Anigozanthos"]})
   totalRecords
0          5952

Another, shorter, way to write this query is by using the scientificName and scientificNameAuthorship as the taxon name. To test that we do get the same number of records, we can simply run

galah.atlas_counts(taxa="Anigozanthos Labill.")
   totalRecords
0          5952

Downloading occurrence records

Now that we know the correct genus, we can start to download occurrence records. Since there are ~6000 records, let’s choose a smaller subset to draw buffers around. Since a breeding program was started in 2007 by Kings Park Botanic and Garden Board to protect the Kangaroo Paw from disease and the impact of climate changes, let’s only include occurrence records starting from 2007 onwards.

galah.atlas_counts(taxa="Anigozanthos Labill.",filters="year>=2007")
   totalRecords
0          2374

1936 is a more manageable number. However, there are different ways to observe a species. Since we want to only include ones humans have observed in the wild, we will add the filter basisOfRecord=HUMAN_OBSERVATION

galah.atlas_counts(taxa="Anigozanthos Labill.",filters=["year>=2007","basisOfRecord=HUMAN_OBSERVATION"])
   totalRecords
0          2102

Now, we can download all occurrences of all species of Kangaroo Paw and plot it on a map to do an initial check that we have all records in Kangaroo Paw’s natural habitat, southwest WA.

import geopandas as gpd
import matplotlib.pyplot as plt
kanga_paw_occ = galah.atlas_occurrences(taxa="Anigozanthos Labill.",filters=["year>=2007","basisOfRecord=HUMAN_OBSERVATION"])
states = gpd.read_file("STE_2021_AUST_GDA2020.shp")
states = states.to_crs(4326)
ax = states.plot(edgecolor = "#5A5A5A", linewidth = 0.5, facecolor = "white", figsize = (12,6))
plt.scatter(kanga_paw_occ['decimalLongitude'],kanga_paw_occ['decimalLatitude'], c = "red")

Here, we can see that the ALA has records that are outside WA, and if we want to only draw a buffer region around species in Western Australia, we will have to filter out the points in the east. To do this, we will provide the polygon representing WA to atlas_occurrences() to remove the occurrences seen in the Eastern parts of Australia.

kanga_paw_occ_pol = galah.atlas_occurrences(
    taxa="Anigozanthos Labill.",
    filters=["year>=2007","basisOfRecord=HUMAN_OBSERVATION"],
    polygon = states[states["STE_NAME21"] == "Western Australia"]["geometry"][4],
    simplify_polygon=True,
)
ax = states[states["STE_NAME21"] == "Western Australia"].plot(edgecolor = "#5A5A5A", linewidth = 0.5, facecolor = "white", figsize = (12,6))
plt.scatter(kanga_paw_occ_pol['decimalLongitude'],kanga_paw_occ_pol['decimalLatitude'], c = "red")

Now, we have the occurrences we want.

galah.atlas_counts(
    taxa="Anigozanthos Labill.",
    filters=["year>=2007","basisOfRecord=HUMAN_OBSERVATION"],
    polygon = states[states["STE_NAME21"] == "Western Australia"]["geometry"][4],
    simplify_polygon=True,
    group_by="species",
    expand=False
)
                      species  count
0        Anigozanthos bicolor     86
1       Anigozanthos flavidus    179
2      Anigozanthos gabrielae      2
3        Anigozanthos humilis    602
4  Anigozanthos kalbarriensis      1
5      Anigozanthos manglesii    599
6       Anigozanthos preissii     31
7   Anigozanthos pulcherrimus      9
8          Anigozanthos rufus    110
9        Anigozanthos viridis     35

We can see that the most recorded species is Anigozanthos manglesii, which is the Red and Green Kangaroo Paw and the state emblem of WA. Since there are a decent number of records, but not as many as the entire genus, let’s choose to focus on the Red and Green Kangaroo Paw.

anigozanthos_manglesii = galah.atlas_occurrences(
    taxa = "Anigozanthos manglesii",
    filters=["year>=2007","basisOfRecord=HUMAN_OBSERVATION"],
    polygon = states[states["STE_NAME21"] == "Western Australia"]["geometry"][4],
    simplify_polygon=True
)

We can visualise these points to check that they are in the southwest corner of WA.

ax = states[states["STE_NAME21"] == "Western Australia"].plot(edgecolor = "#5A5A5A", linewidth = 0.5, facecolor = "white", figsize = (12,6))
plt.scatter(anigozanthos_manglesii['decimalLongitude'],anigozanthos_manglesii['decimalLatitude'], c = "red")

Adding Buffers

Now that we’ve got our occurrences, we can start adding buffers around these points. To do this, we will need the {shapely} package in Python. First, we will convert all of the decimalLongitude and decimalLatitude points to Point objects from {shapely}.

import shapely
from shapely.geometry import Point,Polygon
points_angiozanthos_manglesii = [Point((row['decimalLongitude'],row['decimalLatitude'])) for i,row in anigozanthos_manglesii[['decimalLongitude','decimalLatitude']].iterrows()]

Now that we have a list of all the points, we can add them to a geopandas DataFrame. This is so we can manipulate spatial data more easily than in pandas Dataframes. We will also set the Coordinate Reference System (CRS) to EPSG:4326, as this is the CRS of all ALA data points.

gdf_anigozanthos_manglesii = gpd.GeoDataFrame(anigozanthos_manglesii,geometry=points_angiozanthos_manglesii)
gdf_anigozanthos_manglesii.set_crs(epsg=4326, inplace=True)
     decimalLatitude  decimalLongitude             eventDate                          scientificName                                    taxonConceptID                              recordID                   dataResourceName occurrenceStatus                     geometry
0         -34.509491        117.012040  2019-10-22T10:37:53Z                  Anigozanthos manglesii  https://id.biodiversity.org.au/node/apni/2900921  24ccb120-2e37-4959-a198-a72629276e8e              iNaturalist Australia          PRESENT  POINT (117.01204 -34.50949)
1         -34.509391        117.012412  2019-10-22T10:36:58Z                  Anigozanthos manglesii  https://id.biodiversity.org.au/node/apni/2900921  2c5cb0fe-01bd-4927-9ab7-607beaadb315              iNaturalist Australia          PRESENT  POINT (117.01241 -34.50939)
2         -34.507979        116.907720  2021-10-30T14:55:00Z                  Anigozanthos manglesii  https://id.biodiversity.org.au/node/apni/2900921  57581a54-c592-4547-8864-f0f2034c91cf              iNaturalist Australia          PRESENT  POINT (116.90772 -34.50798)
3         -34.448681        116.989865  2023-10-29T13:01:00Z                  Anigozanthos manglesii  https://id.biodiversity.org.au/node/apni/2900921  30a2c3ad-6386-4fdf-90a7-4fa81febed7b              iNaturalist Australia          PRESENT  POINT (116.98986 -34.44868)
4         -34.437661        116.954487  2022-10-29T13:53:16Z                  Anigozanthos manglesii  https://id.biodiversity.org.au/node/apni/2900921  4097f747-ab3d-44da-bba6-98bedaf094ba              iNaturalist Australia          PRESENT  POINT (116.95449 -34.43766)
..               ...               ...                   ...                                     ...                                               ...                                   ...                                ...              ...                          ...
594       -27.677102        114.272375  2024-09-22T17:28:42Z                  Anigozanthos manglesii  https://id.biodiversity.org.au/node/apni/2900921  c4befee3-8225-4fc0-9c49-02dfae247127              iNaturalist Australia          PRESENT  POINT (114.27237 -27.67710)
595       -27.663992        114.295055  2012-10-14T10:32:00Z                  Anigozanthos manglesii  https://id.biodiversity.org.au/node/apni/2900921  1aba7449-a3e1-4127-8ac6-d50bef684987              iNaturalist Australia          PRESENT  POINT (114.29506 -27.66399)
596       -27.607720        114.420170  2011-09-20T13:44:00Z  Anigozanthos manglesii subsp. quadrans  https://id.biodiversity.org.au/node/apni/2894068  de617c68-bb47-4403-b355-9974bfba6df0              iNaturalist Australia          PRESENT  POINT (114.42017 -27.60772)
597       -27.564815        114.425332  2016-09-10T16:27:00Z  Anigozanthos manglesii subsp. quadrans  https://id.biodiversity.org.au/node/apni/2894068  f9c14a93-baf9-41c1-b3a1-fe7a9709ffa7              iNaturalist Australia          PRESENT  POINT (114.42533 -27.56481)
598       -27.260710        114.049158  2022-08-20T16:00:00Z                  Anigozanthos manglesii  https://id.biodiversity.org.au/node/apni/2900921  2788b724-4bbe-4979-aef7-9a9b34617ae7  ALA species sightings and OzAtlas          PRESENT  POINT (114.04916 -27.26071)

[599 rows x 9 columns]

Now, to add buffers of a certain radius, we have to convert our current CRS (which represents coordinates in degrees) to a CRS that represents coordinates in meters. This is so so we can directly add buffers in meters. For our example, we will be using EPSG:3577, which is Australian Albers, and a widely used CRS in Australia when one needs to use meters as a unit. This is likely different around the world, so be sure to check what CRS is right for your area.

gdf_anigozanthos_manglesii_meters = gdf_anigozanthos_manglesii.to_crs(3577)

Now, we will be creating five different buffers around our chosen points: 5km, 10km, 15km, 20km, and 25km. For each buffer radius, we will be adding a circle of the chosen radius around each point. We will then put them in a geodataframe so we can easily convert the buffers back from meters into degrees so they agree with our original points. Then, we will perform something called a unary_union, which is a function in shapely that allows you to unify many shapes into one shape object. We will then be storing this in a dictionary.

buffer_shapes = {}
buffer_lengths = {"5km": 5000, "10km": 10000,"15km": 15000,"20km": 20000,"25km": 20000}
for length in buffer_lengths:
  buffers = [row["geometry"].buffer(buffer_lengths[length]) for i,row in gdf_anigozanthos_manglesii_meters.iterrows()]
  gdf_buffers = gpd.GeoSeries(buffers).set_crs(3577)
  gdf_buffers_degrees = gdf_buffers.to_crs(4326)
  union_buffers_degrees = shapely.unary_union(gdf_buffers_degrees)
  buffer_shapes[length] = union_buffers_degrees

Now, we can finally plot the buffers! Below is a code using the dictionary we just created to run a for loop over all the buffers to easily plot them, and also colour them in different colours.

ax = states[states["STE_NAME21"] == "Western Australia"].plot(edgecolor = "#5A5A5A", linewidth = 0.5, facecolor = "white", figsize = (12,6))
ax.set_ylim([-34,-27])
ax.set_xlim([114,119])
colors = ["red","orange","green","purple","black"]
plt.scatter(anigozanthos_manglesii['decimalLongitude'],anigozanthos_manglesii['decimalLatitude'], c = "blue", s = 2)
for i,length in enumerate(buffer_lengths):
  for j,geom in enumerate(buffer_shapes[length].geoms):
    if j==0:
      plt.plot(*geom.exterior.xy,c=colors[i],lw=0.5,label=length)
    else:
      plt.plot(*geom.exterior.xy,c=colors[i],lw=0.5)
plt.legend(fontsize=16)

Final thoughts

We hope this point has helped make the steps of adding buffers around points on a map clearer and easier. For more advanced mapping in Python, check out our ALA Labs article on how to map invasive species.

Expand for session info
─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.4.1 (2024-06-14 ucrt)
 os       Windows 10 x64 (build 19045)
 system   x86_64, mingw32
 ui       RTerm
 language (EN)
 collate  English_Australia.utf8
 ctype    English_Australia.utf8
 tz       Australia/Sydney
 date     2024-10-29
 pandoc   3.1.11 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────
 package     * version date (UTC) lib source
 htmltools   * 0.5.8.1 2024-04-04 [1] CRAN (R 4.4.1)
 sessioninfo * 1.2.2   2021-12-06 [1] CRAN (R 4.3.2)

 [1] C:/Users/KEL329/R-packages
 [2] C:/Users/KEL329/AppData/Local/Programs/R/R-4.4.1/library

─ Python configuration ───────────────────────────────────────────────────────
 python:         C:/Users/KEL329/OneDrive - CSIRO/Documents/ALA/Github/ala-labs/.venv/Scripts/python.exe
 libpython:      C:/Users/KEL329/AppData/Local/Programs/Python/Python312/python312.dll
 pythonhome:     C:/Users/KEL329/OneDrive - CSIRO/Documents/ALA/Github/ala-labs/.venv
 version:        3.12.0 (tags/v3.12.0:0fb18b0, Oct  2 2023, 13:03:39) [MSC v.1935 64 bit (AMD64)]
 Architecture:   64bit
 numpy:          C:/Users/KEL329/OneDrive - CSIRO/Documents/ALA/Github/ala-labs/.venv/Lib/site-packages/numpy
 numpy_version:  1.26.2
 
 NOTE: Python version was forced by VIRTUAL_ENV

──────────────────────────────────────────────────────────────────────────────